Overview

Dataset statistics

Number of variables: 29
Number of observations: 1419263
Missing cells: 1609486
Missing cells (%): 3.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 314.0 MiB
Average record size in memory: 232.0 B
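
The derived figures in this block can be rechecked from the raw counts. The following is a minimal sketch, assuming that "Missing cells (%)" is the share of missing cells over all cells (variables × observations) and that the average record size is total memory divided by the number of observations; the variable names are illustrative only.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 29
n_observations = 1_419_263
missing_cells = 1_609_486
total_bytes = 314.0 * 1024**2  # 314.0 MiB expressed in bytes

# Assumed definitions: share of missing cells over all cells,
# and total memory spread evenly over the observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures the report shows.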

Variable types

Numeric: 3
Categorical: 24
Boolean: 2

Alerts

Rndrng_Prvdr_Last_Org_Name has a high cardinality: 180859 distinct values High cardinality
Rndrng_Prvdr_First_Name has a high cardinality: 49837 distinct values High cardinality
Rndrng_Prvdr_Crdntls has a high cardinality: 9299 distinct values High cardinality
Rndrng_Prvdr_St1 has a high cardinality: 243256 distinct values High cardinality
Rndrng_Prvdr_St2 has a high cardinality: 41274 distinct values High cardinality
Rndrng_Prvdr_City has a high cardinality: 10579 distinct values High cardinality
Rndrng_Prvdr_State_Abrvtn has a high cardinality: 61 distinct values High cardinality
Rndrng_Prvdr_Zip5 has a high cardinality: 19042 distinct values High cardinality
Rndrng_Prvdr_Type has a high cardinality: 99 distinct values High cardinality
HCPCS_Cd has a high cardinality: 4604 distinct values High cardinality
HCPCS_Desc has a high cardinality: 4239 distinct values High cardinality
Tot_Benes has a high cardinality: 3691 distinct values High cardinality
Tot_Srvcs has a high cardinality: 10080 distinct values High cardinality
Tot_Bene_Day_Srvcs has a high cardinality: 5073 distinct values High cardinality
Avg_Sbmtd_Chrg has a high cardinality: 128266 distinct values High cardinality
Avg_Mdcr_Alowd_Amt has a high cardinality: 68010 distinct values High cardinality
Avg_Mdcr_Pymt_Amt has a high cardinality: 59738 distinct values High cardinality
Avg_Mdcr_Stdzd_Amt has a high cardinality: 53642 distinct values High cardinality
Rndrng_Prvdr_Ent_Cd is highly correlated with Rndrng_Prvdr_Type and 2 other fields High correlation
Rndrng_Prvdr_Type is highly correlated with Rndrng_Prvdr_Ent_Cd and 1 other fields High correlation
Rndrng_Prvdr_MI is highly correlated with Rndrng_Prvdr_Ent_Cd High correlation
Rndrng_Prvdr_Gndr is highly correlated with Rndrng_Prvdr_Ent_Cd High correlation
Place_Of_Srvc is highly correlated with Rndrng_Prvdr_Type High correlation
Rndrng_Prvdr_Gndr is highly correlated with Rndrng_Prvdr_Type High correlation
Rndrng_Prvdr_Ent_Cd is highly correlated with Rndrng_Prvdr_Type High correlation
Rndrng_Prvdr_State_Abrvtn is highly correlated with Rndrng_Prvdr_State_FIPS and 3 other fields High correlation
Rndrng_Prvdr_State_FIPS is highly correlated with Rndrng_Prvdr_State_Abrvtn and 1 other fields High correlation
Rndrng_Prvdr_RUCA is highly correlated with Rndrng_Prvdr_State_Abrvtn and 2 other fields High correlation
Rndrng_Prvdr_RUCA_Desc is highly correlated with Rndrng_Prvdr_State_Abrvtn and 1 other fields High correlation
Rndrng_Prvdr_Cntry is highly correlated with Rndrng_Prvdr_State_Abrvtn High correlation
Rndrng_Prvdr_Type is highly correlated with Rndrng_Prvdr_Gndr and 2 other fields High correlation
Place_Of_Srvc is highly correlated with Rndrng_Prvdr_Type High correlation
Rndrng_Prvdr_First_Name has 61069 (4.3%) missing values Missing
Rndrng_Prvdr_MI has 438461 (30.9%) missing values Missing
Rndrng_Prvdr_Crdntls has 110474 (7.8%) missing values Missing
Rndrng_Prvdr_Gndr has 61069 (4.3%) missing values Missing
Rndrng_Prvdr_St2 has 935854 (65.9%) missing values Missing
Rndrng_Prvdr_RUCA is highly skewed (γ1 = 22.65286645) Skewed

Reproduction

Analysis started: 2022-01-15 10:22:18.326433
Analysis finished: 2022-01-15 10:24:57.378226
Duration: 2 minutes and 39.05 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
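
The reported duration follows directly from the two timestamps. A short sketch using only the Python standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2022-01-15 10:22:18.326433")
finished = datetime.fromisoformat("2022-01-15 10:24:57.378226")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```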

Variables

Rndrng_NPI
Real number (ℝ≥0)

Distinct: 629503
Distinct (%): 44.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1499135331
Minimum: 1003000134
Maximum: 1992999825
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.8 MiB

Quantile statistics

Minimum: 1003000134
5-th percentile: 1053303964
Q1: 1245657278
Median: 1497945216
Q3: 1740622699
95-th percentile: 1942687785
Maximum: 1992999825
Range: 989999691
Interquartile range (IQR): 494965421

Descriptive statistics

Standard deviation: 287769373.3
Coefficient of variation (CV): 0.1919569017
Kurtosis: -1.197802575
Mean: 1499135331
Median Absolute Deviation (MAD): 242787854
Skewness: -0.01354769181
Sum: 2.127667307 × 10^15
Variance: 8.281121223 × 10^16
Monotonicity: Increasing
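
Several of the statistics above are simple functions of the others. The sketch below recomputes them from the reported values, assuming the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = 1_003_000_134, 1_992_999_825
q1, q3 = 1_245_657_278, 1_740_622_699
mean, std = 1_499_135_331, 287_769_373.3

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"Variance = {variance:.9e}")
```

Because Rndrng_NPI is a provider identifier rather than a measured quantity, these figures describe how the identifiers are spread over the ID space and are best read as a sanity check.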
Histogram with fixed size bins (bins=50)
Common values

Value                   Count     Frequency (%)
1366479099              90        < 0.1%
1538144910              89        < 0.1%
1134277494              89        < 0.1%
1891731626              87        < 0.1%
1063497451              86        < 0.1%
1932145778              84        < 0.1%
1205872041              80        < 0.1%
1538105366              79        < 0.1%
1790721538              78        < 0.1%
1881630614              77        < 0.1%
Other values (629493)   1418424   99.9%

Minimum 10 values

Value        Count   Frequency (%)
1003000134   2       < 0.1%
1003000142   1       < 0.1%
1003000522   1       < 0.1%
1003000530   2       < 0.1%
1003000597   9       < 0.1%
1003000639   1       < 0.1%
1003000738   1       < 0.1%
1003000829   1       < 0.1%
1003000936   4       < 0.1%
1003001041   1       < 0.1%

Maximum 10 values

Value        Count   Frequency (%)
1992999825   3       < 0.1%
1992999775   1       < 0.1%
1992999551   2       < 0.1%
1992999437   1       < 0.1%
1992999270   1       < 0.1%
1992999122   5       < 0.1%
1992998736   1       < 0.1%
1992998710   1       < 0.1%
1992998702   1       < 0.1%
1992998645   2       < 0.1%

Rndrng_Prvdr_Last_Org_Name
Categorical

HIGH CARDINALITY

Distinct180859
Distinct (%)12.7%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
Patel
 
8615
Smith
 
7432
Lee
 
5285
Johnson
 
5247
Walgreen Co
 
5082
Other values (180854)
1387602 

Length

Max length70
Median length6
Mean length7.346388936
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique62917 ?
Unique (%)4.4%

Sample

1st rowCibull
2nd rowCibull
3rd rowKhalil
4th rowWeigand
5th rowSemonche

Common Values

ValueCountFrequency (%)
Patel8615
 
0.6%
Smith7432
 
0.5%
Lee5285
 
0.4%
Johnson5247
 
0.4%
Walgreen Co5082
 
0.4%
Miller4341
 
0.3%
Brown3869
 
0.3%
Shah3820
 
0.3%
Williams3758
 
0.3%
Jones3523
 
0.2%
Other values (180849)1368291
96.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
llc18825
 
1.2%
inc17233
 
1.1%
patel8629
 
0.5%
smith7527
 
0.5%
of7435
 
0.5%
center6518
 
0.4%
co6464
 
0.4%
cvs6408
 
0.4%
walgreen5618
 
0.3%
johnson5333
 
0.3%
Other values (168303)1520522
94.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_First_Name
Categorical

HIGH CARDINALITY
MISSING

Distinct49837
Distinct (%)3.7%
Missing61069
Missing (%)4.3%
Memory size10.8 MiB
Michael
 
32465
David
 
29756
John
 
28498
Robert
 
23059
James
 
20998
Other values (49832)
1223418 

Length

Max length20
Median length6
Mean length5.934750853
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique16460 ?
Unique (%)1.2%

Sample

1st rowThomas
2nd rowThomas
3rd rowRashid
4th rowFrederick
5th rowAmanda

Common Values

ValueCountFrequency (%)
Michael32465
 
2.3%
David29756
 
2.1%
John28498
 
2.0%
Robert23059
 
1.6%
James20998
 
1.5%
William16196
 
1.1%
Mark15693
 
1.1%
Richard13650
 
1.0%
Thomas13179
 
0.9%
Christopher13022
 
0.9%
Other values (49827)1151678
81.1%
(Missing)61069
 
4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
michael32578
 
2.4%
david29824
 
2.2%
john28716
 
2.1%
robert23085
 
1.7%
james21047
 
1.5%
william16237
 
1.2%
mark15750
 
1.2%
richard13666
 
1.0%
thomas13226
 
1.0%
christopher13057
 
1.0%
Other values (46627)1160765
84.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_MI
Categorical

HIGH CORRELATION
MISSING

Distinct31
Distinct (%)< 0.1%
Missing438461
Missing (%)30.9%
Memory size10.8 MiB
A
113222 
M
104021 
J
90634 
L
73318 
S
62189 
Other values (26)
537418 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowL
2nd rowL
3rd rowJ
4th rowM
5th rowM

Common Values

ValueCountFrequency (%)
A113222
 
8.0%
M104021
 
7.3%
J90634
 
6.4%
L73318
 
5.2%
S62189
 
4.4%
R61945
 
4.4%
D56073
 
4.0%
E52808
 
3.7%
C52680
 
3.7%
B37775
 
2.7%
Other values (21)276137
19.5%
(Missing)438461
30.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
a113222
11.5%
m104021
 
10.6%
j90634
 
9.2%
l73318
 
7.5%
s62189
 
6.3%
r61945
 
6.3%
d56073
 
5.7%
e52808
 
5.4%
c52680
 
5.4%
b37775
 
3.9%
Other values (17)276137
28.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_Crdntls
Categorical

HIGH CARDINALITY
MISSING

Distinct9299
Distinct (%)0.7%
Missing110474
Missing (%)7.8%
Memory size10.8 MiB
M.D.
449381 
MD
442868 
D.O.
52425 
DO
 
35988
PA-C
 
32275
Other values (9294)
295852 

Length

Max length20
Median length4
Mean length3.434573487
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4280 ?
Unique (%)0.3%

Sample

1st rowM.D.
2nd rowM.D.
3rd rowM.D.
4th rowMD
5th rowDO

Common Values

ValueCountFrequency (%)
M.D.449381
31.7%
MD442868
31.2%
D.O.52425
 
3.7%
DO35988
 
2.5%
PA-C32275
 
2.3%
DPM17232
 
1.2%
CRNA15249
 
1.1%
NP14802
 
1.0%
O.D.14577
 
1.0%
PA13471
 
0.9%
Other values (9289)220521
15.5%
(Missing)110474
 
7.8%
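
Part of the high cardinality of Rndrng_Prvdr_Crdntls appears to come from punctuation variants such as "M.D." and "MD". The sketch below shows one possible normalisation, assuming that variants differing only in periods, spaces, or letter case denote the same credential. That is an assumption about the data, not something the report states, and `normalize_credential` is a hypothetical helper.

```python
from collections import Counter

def normalize_credential(raw: str) -> str:
    """Drop periods and spaces and upper-case the credential string."""
    return raw.replace(".", "").replace(" ", "").upper()

# Top five credential counts copied from the Common Values table above.
top_counts = {
    "M.D.": 449381,
    "MD": 442868,
    "D.O.": 52425,
    "DO": 35988,
    "PA-C": 32275,
}

# Merge counts whose normalised spelling is identical.
merged = Counter()
for raw_value, count in top_counts.items():
    merged[normalize_credential(raw_value)] += count

for credential, count in merged.most_common():
    print(credential, count)
```

With this rule the five most frequent raw values collapse into three groups; whether such merging is appropriate depends on how the credentials are used downstream.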

Length

Histogram of lengths of the category
ValueCountFrequency (%)
m.d474813
34.6%
md453968
33.1%
d.o56024
 
4.1%
do36518
 
2.7%
pa-c33637
 
2.5%
pt17609
 
1.3%
dpm17597
 
1.3%
dpt16412
 
1.2%
crna15741
 
1.1%
np15228
 
1.1%
Other values (3539)233307
17.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_Gndr
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct2
Distinct (%)< 0.1%
Missing61069
Missing (%)4.3%
Memory size10.8 MiB
M
910650 
F
447544 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowM
2nd rowM
3rd rowM
4th rowM
5th rowF

Common Values

ValueCountFrequency (%)
M910650
64.2%
F447544
31.5%
(Missing)61069
 
4.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
m910650
67.0%
f447544
33.0%
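
The two sets of percentages shown for Rndrng_Prvdr_Gndr (64.2% / 31.5% in the Common Values table and 67.0% / 33.0% here) are consistent with two different denominators: all rows versus non-missing rows only. A short check, with illustrative variable names:

```python
# Counts copied from the Rndrng_Prvdr_Gndr tables.
male, female, missing = 910_650, 447_544, 61_069

total_rows = male + female + missing   # every observation
non_missing = male + female            # rows with a recorded gender

# Shares over all rows (matches the Common Values table).
male_all = 100 * male / total_rows
female_all = 100 * female / total_rows

# Shares over non-missing rows (matches this table).
male_known = 100 * male / non_missing
female_known = 100 * female / non_missing

print(f"All rows:    M {male_all:.1f}%  F {female_all:.1f}%")
print(f"Non-missing: M {male_known:.1f}%  F {female_known:.1f}%")
```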

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_Ent_Cd
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
I
1358194 
O
 
61069

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowI
2nd rowI
3rd rowI
4th rowI
5th rowI

Common Values

ValueCountFrequency (%)
I1358194
95.7%
O61069
 
4.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
i1358194
95.7%
o61069
 
4.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_St1
Categorical

HIGH CARDINALITY

Distinct243256
Distinct (%)17.1%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
200 1st St Sw
 
4081
9500 Euclid Ave
 
2440
4500 San Pablo Rd S
 
1509
75 Francis St
 
1408
200 Hawkins Dr
 
1385
Other values (243251)
1408440 

Length

Max length55
Median length17
Mean length17.92835225
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique94418 ?
Unique (%)6.7%

Sample

1st row2650 Ridge Ave
2nd row2650 Ridge Ave
3rd row4126 N Holland Sylvania Rd
4th row1565 Saxon Blvd
5th row1021 Park Ave

Common Values

ValueCountFrequency (%)
200 1st St Sw4081
 
0.3%
9500 Euclid Ave2440
 
0.2%
4500 San Pablo Rd S1509
 
0.1%
75 Francis St1408
 
0.1%
200 Hawkins Dr1385
 
0.1%
5323 Harry Hines Blvd1359
 
0.1%
12605 E 16th Ave1357
 
0.1%
13400 E Shea Blvd1335
 
0.1%
1 Medical Center Dr1314
 
0.1%
1515 Holcombe Blvd1313
 
0.1%
Other values (243246)1401762
98.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
st326601
 
5.9%
ave257019
 
4.7%
rd229133
 
4.2%
dr170876
 
3.1%
ste167024
 
3.0%
n120869
 
2.2%
blvd118828
 
2.2%
w113201
 
2.1%
e107111
 
1.9%
s102917
 
1.9%
Other values (47201)3790751
68.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_St2
Categorical

HIGH CARDINALITY
MISSING

Distinct41274
Distinct (%)8.5%
Missing935854
Missing (%)65.9%
Memory size10.8 MiB
Suite 200
 
18639
Suite 100
 
17586
Suite 101
 
9444
Suite 300
 
8839
Suite A
 
7813
Other values (41269)
421088 

Length

Max length55
Median length9
Mean length11.48792224
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique16117 ?
Unique (%)3.3%

Sample

1st rowEvanston Hospital
2nd rowEvanston Hospital
3rd rowSuite 220
4th rowSuite 102
5th rowSuite 203

Common Values

Value                  Count    Frequency (%)
Suite 200              18639    1.3%
Suite 100              17586    1.2%
Suite 101              9444     0.7%
Suite 300              8839     0.6%
Suite A                7813     0.6%
Suite 201              7629     0.5%
Suite B                4964     0.3%
Suite 102              4914     0.3%
Ste 100                4514     0.3%
Suite 1                4444     0.3%
Other values (41264)   394623   27.8%
(Missing)              935854   65.9%

Length

Histogram of lengths of the category
Value                  Count    Frequency (%)
suite                  272728   24.5%
ste                    68253    6.1%
100                    25514    2.3%
200                    25508    2.3%
of                     15974    1.4%
300                    13954    1.3%
101                    13741    1.2%
floor                  12895    1.2%
a                      12602    1.1%
department             12287    1.1%
Other values (19369)   641035   57.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_City
Categorical

HIGH CARDINALITY

Distinct10579
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
New York
 
15341
Houston
 
12432
Chicago
 
10829
Philadelphia
 
9909
Boston
 
9177
Other values (10574)
1361575 

Length

Max length28
Median length9
Mean length8.93935303
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1689 ?
Unique (%)0.1%

Sample

1st rowEvanston
2nd rowEvanston
3rd rowToledo
4th rowDeltona
5th rowQuakertown

Common Values

ValueCountFrequency (%)
New York15341
 
1.1%
Houston12432
 
0.9%
Chicago10829
 
0.8%
Philadelphia9909
 
0.7%
Boston9177
 
0.6%
Baltimore8898
 
0.6%
Dallas8715
 
0.6%
Springfield8501
 
0.6%
Columbus8388
 
0.6%
Los Angeles8037
 
0.6%
Other values (10569)1319036
92.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
city34531
 
1.9%
new29178
 
1.6%
san22958
 
1.3%
beach17876
 
1.0%
york17235
 
0.9%
fort15986
 
0.9%
saint13649
 
0.8%
park13567
 
0.7%
houston12568
 
0.7%
west12416
 
0.7%
Other values (8304)1626737
89.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_State_Abrvtn
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct61
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
CA
110538 
FL
102444 
TX
96614 
NY
 
90202
PA
 
64824
Other values (56)
954641 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowIL
2nd rowIL
3rd rowOH
4th rowFL
5th rowPA

Common Values

ValueCountFrequency (%)
CA110538
 
7.8%
FL102444
 
7.2%
TX96614
 
6.8%
NY90202
 
6.4%
PA64824
 
4.6%
IL57438
 
4.0%
OH52053
 
3.7%
NC51560
 
3.6%
MI47291
 
3.3%
NJ44655
 
3.1%
Other values (51)701644
49.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca110538
 
7.8%
fl102444
 
7.2%
tx96614
 
6.8%
ny90202
 
6.4%
pa64824
 
4.6%
il57438
 
4.0%
oh52053
 
3.7%
nc51560
 
3.6%
mi47291
 
3.3%
nj44655
 
3.1%
Other values (51)701644
49.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_State_FIPS
Real number (ℝ≥0)

HIGH CORRELATION

Distinct56
Distinct (%)< 0.1%
Missing853
Missing (%)0.1%
Infinite0
Infinite (%)0.0%
Mean28.36898711
Minimum1
Maximum78
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size10.8 MiB

Quantile statistics

Minimum1
5-th percentile5
Q113
median28
Q342
95-th percentile51
Maximum78
Range77
Interquartile range (IQR)29

Descriptive statistics

Standard deviation15.65965276
Coefficient of variation (CV)0.5519990087
Kurtosis-1.133610218
Mean28.36898711
Median Absolute Deviation (MAD)14
Skewness-0.0001138981059
Sum40238855
Variance245.2247246
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count    Frequency (%)
6                   110465   7.8%
12                  102515   7.2%
48                  96600    6.8%
36                  90185    6.4%
42                  64823    4.6%
17                  57433    4.0%
39                  52005    3.7%
37                  51558    3.6%
26                  47268    3.3%
34                  44652    3.1%
Other values (46)   700906   49.4%

Minimum 10 values

Value   Count    Frequency (%)
1       24523    1.7%
2       2699     0.2%
4       29812    2.1%
5       15542    1.1%
6       110465   7.8%
8       20485    1.4%
9       18111    1.3%
10      5366     0.4%
11      3787     0.3%
12      102515   7.2%

Maximum 10 values

Value   Count   Frequency (%)
78      218     < 0.1%
72      3575    0.3%
69      24      < 0.1%
66      233     < 0.1%
60      1       < 0.1%
56      2562    0.2%
55      25798   1.8%
54      8755    0.6%
53      29439   2.1%
51      39681   2.8%

Rndrng_Prvdr_Zip5
Categorical

HIGH CARDINALITY

Distinct19042
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
77030
 
4678
55905
 
4119
72205
 
3279
63110
 
2840
44195
 
2830
Other values (19037)
1401517 

Length

Max length5
Median length5
Mean length4.926417443
Min length1
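
A minimum length of 1 and a mean length just under 5 suggest that some ZIP codes are stored with fewer than five characters. One plausible cause, offered here only as an assumption, is that leading zeros were lost when the column was read as a number. The sketch below shows how such values could be left-padded; `pad_zip5` is a hypothetical helper and the input "2115" is an invented example, not a value from the report.

```python
def pad_zip5(value) -> str:
    """Left-pad an all-digit ZIP code with zeros to five characters."""
    text = str(value).strip()
    # Leave anything that is not purely digits untouched.
    return text.zfill(5) if text.isdigit() else text

print(pad_zip5("2115"))   # invented short value
print(pad_zip5(60201))    # already five digits, returned unchanged
```

Before applying a rule like this, it would be worth inspecting the short values directly, since padding hides entries that are wrong for other reasons.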

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2412 ?
Unique (%)0.2%

Sample

1st row60201
2nd row60201
3rd row43623
4th row32725
5th row18951

Common Values

Value                  Count     Frequency (%)
77030                  4678      0.3%
55905                  4119      0.3%
72205                  3279      0.2%
63110                  2840      0.2%
44195                  2830      0.2%
78229                  2683      0.2%
76104                  2514      0.2%
19104                  2451      0.2%
60611                  2410      0.2%
19107                  2390      0.2%
Other values (19032)   1389069   97.9%

Length

Histogram of lengths of the category
Value                  Count     Frequency (%)
77030                  4678      0.3%
55905                  4119      0.3%
72205                  3279      0.2%
63110                  2840      0.2%
44195                  2830      0.2%
78229                  2683      0.2%
76104                  2514      0.2%
19104                  2451      0.2%
60611                  2410      0.2%
19107                  2390      0.2%
Other values (19035)   1389076   97.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_RUCA
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct22
Distinct (%)< 0.1%
Missing853
Missing (%)0.1%
Infinite0
Infinite (%)0.0%
Mean1.603277684
Minimum1
Maximum99
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size10.8 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile4
Maximum99
Range98
Interquartile range (IQR)0

Descriptive statistics

Standard deviation3.648754583
Coefficient of variation (CV)2.275809499
Kurtosis596.5544717
Mean1.603277684
Median Absolute Deviation (MAD)0
Skewness22.65286645
Sum2274105.1
Variance13.31341
MonotonicityNot monotonic
Histogram with fixed size bins (bins=22)
Common values

Value               Count     Frequency (%)
1                   1204683   84.9%
4                   96432     6.8%
2                   33643     2.4%
7                   30459     2.1%
1.1                 20985     1.5%
10                  10406     0.7%
5                   6091      0.4%
4.1                 4926      0.3%
7.1                 2002      0.1%
3                   1749      0.1%
Other values (12)   7034      0.5%

Minimum 10 values

Value   Count     Frequency (%)
1       1204683   84.9%
1.1     20985     1.5%
2       33643     2.4%
2.1     591       < 0.1%
3       1749      0.1%
4       96432     6.8%
4.1     4926      0.3%
5       6091      0.4%
5.1     157       < 0.1%
6       504       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
99      1674    0.1%
10.3    125     < 0.1%
10.2    564     < 0.1%
10.1    396     < 0.1%
10      10406   0.7%
9       633     < 0.1%
8.2     15      < 0.1%
8.1     7       < 0.1%
8       1476    0.1%
7.2     892     0.1%
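
The three tables above together list all 22 distinct Rndrng_Prvdr_RUCA values, so the mean and skewness can be recomputed from the frequency table alone. The sketch below does this with and without the value 99. Treating 99 as an "unknown" sentinel is an assumption, suggested by the "Unknown" category in Rndrng_Prvdr_RUCA_Desc having the same count (1674). The skewness here is the plain population moment ratio, which may differ very slightly from the bias-adjusted estimator the report uses.

```python
# Value -> count, copied from the RUCA frequency tables above.
ruca_counts = {
    1: 1204683, 1.1: 20985, 2: 33643, 2.1: 591, 3: 1749,
    4: 96432, 4.1: 4926, 5: 6091, 5.1: 157, 6: 504,
    7: 30459, 7.1: 2002, 7.2: 892, 8: 1476, 8.1: 7, 8.2: 15,
    9: 633, 10: 10406, 10.1: 396, 10.2: 564, 10.3: 125, 99: 1674,
}

def moments(counts):
    """Return (n, mean, population skewness) for a value -> count table."""
    n = sum(counts.values())
    mean = sum(value * count for value, count in counts.items()) / n
    m2 = sum(count * (value - mean) ** 2 for value, count in counts.items()) / n
    m3 = sum(count * (value - mean) ** 3 for value, count in counts.items()) / n
    return n, mean, m3 / m2 ** 1.5

n_all, mean_all, skew_all = moments(ruca_counts)

# Same statistics with the assumed sentinel value 99 excluded.
known_counts = {v: c for v, c in ruca_counts.items() if v != 99}
n_known, mean_known, skew_known = moments(known_counts)

print(f"all values:   n={n_all}, mean={mean_all:.6f}, skew={skew_all:.2f}")
print(f"excluding 99: n={n_known}, mean={mean_known:.6f}, skew={skew_known:.2f}")
```

The comparison shows how much of the reported skewness is driven by a single code that lies far from the rest of the scale.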

Rndrng_Prvdr_RUCA_Desc
Categorical

HIGH CORRELATION

Distinct15
Distinct (%)< 0.1%
Missing853
Missing (%)0.1%
Memory size10.8 MiB
Metropolitan area core: primary flow within an urbanized area of 50,000 and greater
1204683 
Micropolitan area core: primary flow within an urban cluster of 10,000 to 49,999
 
96432
Metropolitan area high commuting: primary flow 30% or more to a urbanized area of 50,000 and greater
 
33643
Small town core: primary flow within an urban cluster of 2,500 to 9,999
 
30459
Secondary flow 30% to <50% to a larger urbanized area of 50,000 and greater
 
21576
Other values (10)
 
31617

Length

Max length100
Median length83
Mean length82.74279651
Min length7

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMetropolitan area core: primary flow within an urbanized area of 50,000 and greater
2nd rowMetropolitan area core: primary flow within an urbanized area of 50,000 and greater
3rd rowMetropolitan area core: primary flow within an urbanized area of 50,000 and greater
4th rowSecondary flow 30% to <50% to a larger urbanized area of 50,000 and greater
5th rowMetropolitan area core: primary flow within an urbanized area of 50,000 and greater

Common Values

ValueCountFrequency (%)
Metropolitan area core: primary flow within an urbanized area of 50,000 and greater1204683
84.9%
Micropolitan area core: primary flow within an urban cluster of 10,000 to 49,99996432
 
6.8%
Metropolitan area high commuting: primary flow 30% or more to a urbanized area of 50,000 and greater33643
 
2.4%
Small town core: primary flow within an urban cluster of 2,500 to 9,99930459
 
2.1%
Secondary flow 30% to <50% to a larger urbanized area of 50,000 and greater21576
 
1.5%
Rural areas: primary flow to a tract outside a urbanized area of 50,000 and greater or UC10406
 
0.7%
Secondary flow 30% to <50% to a urbanized area of 50,000 and greater7488
 
0.5%
Micropolitan high commuting: primary flow 30% or more to a urban cluster of 10,000 to 49,9996091
 
0.4%
Metropolitan area low commuting: primary flow 10% to <30% to a urbanized area of 50,000 and greater1749
 
0.1%
Unknown1674
 
0.1%
Other values (5)4209
 
0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
area2616052
14.0%
flow1416736
 
7.6%
of1416736
 
7.6%
primary1386076
 
7.4%
core1331574
 
7.1%
within1331574
 
7.1%
an1331574
 
7.1%
and1279545
 
6.9%
greater1279545
 
6.9%
urbanized1279545
 
6.9%
Other values (29)3983274
21.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_Cntry
Categorical

HIGH CORRELATION

Distinct19
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
US
1419166 
JP
 
21
CA
 
11
JO
 
8
DE
 
7
Other values (14)
 
50

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowUS
2nd rowUS
3rd rowUS
4th rowUS
5th rowUS

Common Values

ValueCountFrequency (%)
US1419166
> 99.9%
JP21
 
< 0.1%
CA11
 
< 0.1%
JO8
 
< 0.1%
DE7
 
< 0.1%
LB6
 
< 0.1%
TH6
 
< 0.1%
GB6
 
< 0.1%
TR5
 
< 0.1%
IT5
 
< 0.1%
Other values (9)22
 
< 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
us1419166
> 99.9%
jp21
 
< 0.1%
ca11
 
< 0.1%
jo8
 
< 0.1%
de7
 
< 0.1%
lb6
 
< 0.1%
th6
 
< 0.1%
gb6
 
< 0.1%
it5
 
< 0.1%
tr5
 
< 0.1%
Other values (9)22
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_Type
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct99
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
Diagnostic Radiology
175963 
Internal Medicine
142736 
Family Practice
131193 
Nurse Practitioner
98738 
Physician Assistant
 
62410
Other values (94)
808223 

Length

Max length55
Median length17
Mean length17.88965611
Min length7

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPathology
2nd rowPathology
3rd rowAnesthesiology
4th rowFamily Practice
5th rowInternal Medicine

Common Values

ValueCountFrequency (%)
Diagnostic Radiology175963
 
12.4%
Internal Medicine142736
 
10.1%
Family Practice131193
 
9.2%
Nurse Practitioner98738
 
7.0%
Physician Assistant62410
 
4.4%
Cardiology59880
 
4.2%
Physical Therapist in Private Practice44709
 
3.2%
Orthopedic Surgery40875
 
2.9%
Anesthesiology34588
 
2.4%
Ophthalmology33476
 
2.4%
Other values (89)594695
41.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
medicine196255
 
7.1%
practice185058
 
6.7%
radiology183415
 
6.6%
diagnostic180522
 
6.5%
internal142736
 
5.1%
family131193
 
4.7%
nurse118571
 
4.3%
practitioner98738
 
3.5%
surgery84413
 
3.0%
cardiology74034
 
2.7%
Other values (150)1387052
49.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Rndrng_Prvdr_Mdcr_Prtcptg_Ind
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
Value | Count | Frequency (%)
True | 1418812 | > 99.9%
False | 451 | < 0.1%

HCPCS_Cd
Categorical

HIGH CARDINALITY

Distinct4604
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
99213
 
65930
99214
 
64193
99204
 
26519
99203
 
25383
99232
 
25211
Other values (4599)
1212027 

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique779 ?
Unique (%)0.1%

Sample

1st row88341
2nd row88348
3rd row99213
4th row99204
5th row90662

Common Values

Value | Count | Frequency (%)
99213 | 65930 | 4.6%
99214 | 64193 | 4.5%
99204 | 26519 | 1.9%
99203 | 25383 | 1.8%
99232 | 25211 | 1.8%
G0008 | 20180 | 1.4%
99223 | 19237 | 1.4%
99233 | 17891 | 1.3%
99215 | 17817 | 1.3%
99212 | 16686 | 1.2%
Other values (4594) | 1120216 | 78.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
99213 | 65930 | 4.6%
99214 | 64193 | 4.5%
99204 | 26519 | 1.9%
99203 | 25383 | 1.8%
99232 | 25211 | 1.8%
g0008 | 20180 | 1.4%
99223 | 19237 | 1.4%
99233 | 17891 | 1.3%
99215 | 17817 | 1.3%
99212 | 16686 | 1.2%
Other values (4594) | 1120216 | 78.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

HCPCS_Desc
Categorical

HIGH CARDINALITY

Distinct4239
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
Established patient office or other outpatient visit, typically 15 minutes
 
65930
Established patient office or other outpatient, visit typically 25 minutes
 
64193
New patient office or other outpatient visit, typically 45 minutes
 
26519
New patient office or other outpatient visit, typically 30 minutes
 
25383
Subsequent hospital inpatient care, typically 25 minutes per day
 
25211
Other values (4234)
1212027 

Length

Max length256
Median length61
Mean length61.42576323
Min length6

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique688 ?
Unique (%)< 0.1%

Sample

1st rowSpecial stained specimen slides to examine tissue
2nd rowElectron microscopy for diagnosis
3rd rowEstablished patient office or other outpatient visit, typically 15 minutes
4th rowNew patient office or other outpatient visit, typically 45 minutes
5th rowVaccine for influenza for injection into muscle

Common Values

Value | Count | Frequency (%)
Established patient office or other outpatient visit, typically 15 minutes | 65930 | 4.6%
Established patient office or other outpatient, visit typically 25 minutes | 64193 | 4.5%
New patient office or other outpatient visit, typically 45 minutes | 26519 | 1.9%
New patient office or other outpatient visit, typically 30 minutes | 25383 | 1.8%
Subsequent hospital inpatient care, typically 25 minutes per day | 25211 | 1.8%
Administration of influenza virus vaccine | 20180 | 1.4%
Vaccine for influenza for injection into muscle | 19985 | 1.4%
Initial hospital inpatient care, typically 70 minutes per day | 19237 | 1.4%
Subsequent hospital inpatient care, typically 35 minutes per day | 17891 | 1.3%
Established patient office or other outpatient, visit typically 40 minutes | 17817 | 1.3%
Other values (4229) | 1116917 | 78.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
of | 766603 | 6.0%
or | 502526 | 3.9%
minutes | 461110 | 3.6%
typically | 376220 | 2.9%
visit | 322261 | 2.5%
patient | 304113 | 2.4%
and | 251909 | 2.0%
other | 244826 | 1.9%
outpatient | 236534 | 1.8%
office | 236255 | 1.8%
Other values (3503) | 9103205 | 71.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

HCPCS_Drug_Ind
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
Value | Count | Frequency (%)
False | 1331891 | 93.8%
True | 87372 | 6.2%

Place_Of_Srvc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
O
875453 
F
543810 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowF
2nd rowF
3rd rowF
4th rowO
5th rowO

Common Values

Value | Count | Frequency (%)
O | 875453 | 61.7%
F | 543810 | 38.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
o | 875453 | 61.7%
f | 543810 | 38.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Tot_Benes
Categorical

HIGH CARDINALITY

Distinct3691
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
11
 
69540
12
 
61948
13
 
55465
14
 
50197
15
 
45533
Other values (3686)
1136580 

Length

Max length7
Median length2
Mean length2.193651212
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1707 ?
Unique (%)0.1%

Sample

1st row93
2nd row22
3rd row24
4th row25
5th row93

Common Values

Value | Count | Frequency (%)
11 | 69540 | 4.9%
12 | 61948 | 4.4%
13 | 55465 | 3.9%
14 | 50197 | 3.5%
15 | 45533 | 3.2%
16 | 41787 | 2.9%
17 | 38336 | 2.7%
18 | 35546 | 2.5%
19 | 32649 | 2.3%
20 | 31025 | 2.2%
Other values (3681) | 957237 | 67.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
11 | 69540 | 4.9%
12 | 61948 | 4.4%
13 | 55465 | 3.9%
14 | 50197 | 3.5%
15 | 45533 | 3.2%
16 | 41787 | 2.9%
17 | 38336 | 2.7%
18 | 35546 | 2.5%
19 | 32649 | 2.3%
20 | 31025 | 2.2%
Other values (3681) | 957237 | 67.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Tot_Srvcs
Categorical

HIGH CARDINALITY

Distinct10080
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
12
 
44049
13
 
42054
11
 
40839
14
 
39678
15
 
37129
Other values (10075)
1215514 

Length

Max length11
Median length2
Mean length2.34128206
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5287 ?
Unique (%)0.4%

Sample

1st row322
2nd row22
3rd row28
4th row25
5th row99

Common Values

Value | Count | Frequency (%)
12 | 44049 | 3.1%
13 | 42054 | 3.0%
11 | 40839 | 2.9%
14 | 39678 | 2.8%
15 | 37129 | 2.6%
16 | 34801 | 2.5%
17 | 32394 | 2.3%
18 | 30377 | 2.1%
19 | 28456 | 2.0%
20 | 27091 | 1.9%
Other values (10070) | 1062395 | 74.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
12 | 44049 | 3.1%
13 | 42054 | 3.0%
11 | 40839 | 2.9%
14 | 39678 | 2.8%
15 | 37129 | 2.6%
16 | 34801 | 2.5%
17 | 32394 | 2.3%
18 | 30377 | 2.1%
19 | 28456 | 2.0%
20 | 27091 | 1.9%
Other values (10070) | 1062395 | 74.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Tot_Bene_Day_Srvcs
Categorical

HIGH CARDINALITY

Distinct5073
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
12
 
47081
11
 
44782
13
 
44557
14
 
41677
15
 
38851
Other values (5068)
1202315 

Length

Max length9
Median length2
Mean length2.291354034
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2005 ?
Unique (%)0.1%

Sample

1st row98
2nd row22
3rd row28
4th row25
5th row99

Common Values

Value | Count | Frequency (%)
12 | 47081 | 3.3%
11 | 44782 | 3.2%
13 | 44557 | 3.1%
14 | 41677 | 2.9%
15 | 38851 | 2.7%
16 | 36380 | 2.6%
17 | 33796 | 2.4%
18 | 31659 | 2.2%
19 | 29477 | 2.1%
20 | 28144 | 2.0%
Other values (5063) | 1042859 | 73.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
12 | 47081 | 3.3%
11 | 44782 | 3.2%
13 | 44557 | 3.1%
14 | 41677 | 2.9%
15 | 38851 | 2.7%
16 | 36380 | 2.6%
17 | 33796 | 2.4%
18 | 31659 | 2.2%
19 | 29477 | 2.1%
20 | 28144 | 2.0%
Other values (5063) | 1042859 | 73.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Avg_Sbmtd_Chrg
Categorical

HIGH CARDINALITY

Distinct128266
Distinct (%)9.0%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
$150.00
 
13848
$50.00
 
13005
$100.00
 
11912
$30.00
 
11504
$40.00
 
10444
Other values (128261)
1358550 

Length

Max length10
Median length7
Mean length6.764222699
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique59441 ?
Unique (%)4.2%

Sample

1st row$92.00
2nd row$350.00
3rd row$96.00
4th row$499.00
5th row$82.73

Common Values

Value | Count | Frequency (%)
$150.00 | 13848 | 1.0%
$50.00 | 13005 | 0.9%
$100.00 | 11912 | 0.8%
$30.00 | 11504 | 0.8%
$40.00 | 10444 | 0.7%
$200.00 | 10412 | 0.7%
$25.00 | 10097 | 0.7%
$35.00 | 9661 | 0.7%
$60.00 | 9526 | 0.7%
$75.00 | 9014 | 0.6%
Other values (128256) | 1309840 | 92.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
150.00 | 13848 | 1.0%
50.00 | 13005 | 0.9%
100.00 | 11912 | 0.8%
30.00 | 11504 | 0.8%
40.00 | 10444 | 0.7%
200.00 | 10412 | 0.7%
25.00 | 10097 | 0.7%
35.00 | 9661 | 0.7%
60.00 | 9526 | 0.7%
75.00 | 9014 | 0.6%
Other values (128256) | 1309840 | 92.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Avg_Mdcr_Alowd_Amt
Categorical

HIGH CARDINALITY

Distinct68010
Distinct (%)4.8%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
$2.94
 
9681
$2.44
 
5874
$18.65
 
5096
$3.41
 
4769
$10.57
 
4536
Other values (68005)
1389307 

Length

Max length10
Median length6
Mean length6.224355176
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique28828 ?
Unique (%)2.0%

Sample

1st row$30.63
2nd row$81.68
3rd row$48.82
4th row$165.77
5th row$54.65

Common Values

Value | Count | Frequency (%)
$2.94 | 9681 | 0.7%
$2.44 | 5874 | 0.4%
$18.65 | 5096 | 0.4%
$3.41 | 4769 | 0.3%
$10.57 | 4536 | 0.3%
$54.89 | 4179 | 0.3%
$4.28 | 3363 | 0.2%
$8.46 | 3342 | 0.2%
$0.11 | 2881 | 0.2%
$11.51 | 2542 | 0.2%
Other values (68000) | 1373000 | 96.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2.94 | 9681 | 0.7%
2.44 | 5874 | 0.4%
18.65 | 5096 | 0.4%
3.41 | 4769 | 0.3%
10.57 | 4536 | 0.3%
54.89 | 4179 | 0.3%
4.28 | 3363 | 0.2%
8.46 | 3342 | 0.2%
0.11 | 2881 | 0.2%
11.51 | 2542 | 0.2%
Other values (68000) | 1373000 | 96.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Avg_Mdcr_Pymt_Amt
Categorical

HIGH CARDINALITY

Distinct59738
Distinct (%)4.2%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
$2.94
 
9688
$2.44
 
5811
$18.65
 
5100
$3.41
 
4792
$10.57
 
4570
Other values (59733)
1389302 

Length

Max length10
Median length6
Mean length6.069952504
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique25313 ?
Unique (%)1.8%

Sample

1st row$24.49
2nd row$65.30
3rd row$30.35
4th row$98.33
5th row$54.65

Common Values

Value | Count | Frequency (%)
$2.94 | 9688 | 0.7%
$2.44 | 5811 | 0.4%
$18.65 | 5100 | 0.4%
$3.41 | 4792 | 0.3%
$10.57 | 4570 | 0.3%
$54.89 | 4261 | 0.3%
$0.09 | 3399 | 0.2%
$4.28 | 3391 | 0.2%
$8.46 | 3240 | 0.2%
$11.51 | 2509 | 0.2%
Other values (59728) | 1372502 | 96.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2.94 | 9688 | 0.7%
2.44 | 5811 | 0.4%
18.65 | 5100 | 0.4%
3.41 | 4792 | 0.3%
10.57 | 4570 | 0.3%
54.89 | 4261 | 0.3%
0.09 | 3399 | 0.2%
4.28 | 3391 | 0.2%
8.46 | 3240 | 0.2%
11.51 | 2509 | 0.2%
Other values (59728) | 1372502 | 96.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Avg_Mdcr_Stdzd_Amt
Categorical

HIGH CARDINALITY

Distinct53642
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Memory size10.8 MiB
$16.60
 
24613
$2.94
 
14065
$115.84
 
9099
$2.44
 
7300
$10.57
 
6615
Other values (53637)
1357571 

Length

Max length10
Median length6
Mean length6.07057043
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique21509 ?
Unique (%)1.5%

Sample

1st row$23.45
2nd row$62.44
3rd row$32.58
4th row$97.31
5th row$54.65

Common Values

Value | Count | Frequency (%)
$16.60 | 24613 | 1.7%
$2.94 | 14065 | 1.0%
$115.84 | 9099 | 0.6%
$2.44 | 7300 | 0.5%
$10.57 | 6615 | 0.5%
$82.79 | 6386 | 0.4%
$3.41 | 5831 | 0.4%
$57.92 | 5620 | 0.4%
$58.20 | 5278 | 0.4%
$8.46 | 5201 | 0.4%
Other values (53632) | 1329255 | 93.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
16.60 | 24613 | 1.7%
2.94 | 14065 | 1.0%
115.84 | 9099 | 0.6%
2.44 | 7300 | 0.5%
10.57 | 6615 | 0.5%
82.79 | 6386 | 0.4%
3.41 | 5831 | 0.4%
57.92 | 5620 | 0.4%
58.20 | 5278 | 0.4%
8.46 | 5201 | 0.4%
Other values (53632) | 1329255 | 93.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
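The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs: the helper names (`average_ranks`, `pearson`, `spearman_rho`) and the toy data are assumptions, and ties are handled by assigning the mean rank.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): y grows monotonically but non-linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
```

Because only the ranks enter the formula, any strictly increasing transformation of either variable leaves ρ unchanged.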

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not change the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
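As a rough sketch of that formula, the snippet below computes r directly and checks the location/scale invariance mentioned above. The function name `pearson_r` and the toy data are assumptions made for illustration; the normalising factor 1/n cancels between the covariance and the standard deviations, so it is omitted.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # raises ZeroDivisionError if either input is constant


# Toy data (hypothetical values, not taken from the dataset).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and positively rescaling either variable leaves r unchanged.
x_shifted = [3 * a + 1 for a in x]
y_shifted = [0.5 * b - 2 for b in y]
r_shifted = pearson_r(x_shifted, y_shifted)
```

Rescaling by a negative factor would flip the sign of r while keeping its magnitude.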

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
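The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with the function name `kendall_tau_a` and the toy data chosen for illustration only. Tied pairs count as neither concordant nor discordant here; statistical libraries often report a tie-adjusted variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, by direct enumeration."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs
```

Direct enumeration is quadratic in the number of observations, so it suits small examples rather than a dataset of this size.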

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
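A minimal sketch of a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013) is given below. It is an illustration under assumptions, not the report generator's own code: the function name `cramers_v_corrected` and the toy category labels are hypothetical, and the inputs are assumed to hold more than one observation.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    rows, cols = Counter(a), Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_count in rows.items():
        for col_label, col_count in cols.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(rows), len(cols)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data (hypothetical labels): the second variable is a relabelling of the first.
place = ["O", "F"] * 50
setting = ["office" if p == "O" else "facility" for p in place]
v_perfect = cramers_v_corrected(place, setting)

# Toy data in which the two variables are independent by construction.
left = ["O", "O", "F", "F"] * 25
right = ["x", "y", "x", "y"] * 25
v_independent = cramers_v_corrected(left, right)
```

The `max(0.0, ...)` clamp is what lets independent samples report exactly 0 rather than a small positive value.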

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
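The nullity correlation behind the heatmap described above can be approximated as the correlation between two columns' missing-value indicators. The sketch below is an assumption-laden illustration: the function name `nullity_correlation`, the use of `None` to mark a missing cell, and the toy columns are all hypothetical.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the two columns' missing-value indicators (None = missing)."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is never missing or always missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns (hypothetical): the first two are missing together,
# the third is present exactly where the first is missing.
first_name = ["Thomas", None, "Amanda", None]
credentials = ["M.D.", None, "DO", None]
second_line = [None, "Suite 300", None, "Suite 102"]

together = nullity_correlation(first_name, credentials)
opposite = nullity_correlation(first_name, second_line)
```

A value near +1 means the two fields tend to be missing in the same rows; a value near -1 means one tends to be present where the other is absent.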

Sample

First rows

Index | Rndrng_NPI | Rndrng_Prvdr_Last_Org_Name | Rndrng_Prvdr_First_Name | Rndrng_Prvdr_MI | Rndrng_Prvdr_Crdntls | Rndrng_Prvdr_Gndr | Rndrng_Prvdr_Ent_Cd | Rndrng_Prvdr_St1 | Rndrng_Prvdr_St2 | Rndrng_Prvdr_City | Rndrng_Prvdr_State_Abrvtn | Rndrng_Prvdr_State_FIPS | Rndrng_Prvdr_Zip5 | Rndrng_Prvdr_RUCA | Rndrng_Prvdr_RUCA_Desc | Rndrng_Prvdr_Cntry | Rndrng_Prvdr_Type | Rndrng_Prvdr_Mdcr_Prtcptg_Ind | HCPCS_Cd | HCPCS_Desc | HCPCS_Drug_Ind | Place_Of_Srvc | Tot_Benes | Tot_Srvcs | Tot_Bene_Day_Srvcs | Avg_Sbmtd_Chrg | Avg_Mdcr_Alowd_Amt | Avg_Mdcr_Pymt_Amt | Avg_Mdcr_Stdzd_Amt
0 | 1003000134 | Cibull | Thomas | L | M.D. | M | I | 2650 Ridge Ave | Evanston Hospital | Evanston | IL | 17.0 | 60201 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Pathology | Y | 88341 | Special stained specimen slides to examine tissue | N | F | 93 | 322 | 98 | $92.00 | $30.63 | $24.49 | $23.45
1 | 1003000134 | Cibull | Thomas | L | M.D. | M | I | 2650 Ridge Ave | Evanston Hospital | Evanston | IL | 17.0 | 60201 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Pathology | Y | 88348 | Electron microscopy for diagnosis | N | F | 22 | 22 | 22 | $350.00 | $81.68 | $65.30 | $62.44
2 | 1003000142 | Khalil | Rashid | NaN | M.D. | M | I | 4126 N Holland Sylvania Rd | Suite 220 | Toledo | OH | 39.0 | 43623 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Anesthesiology | Y | 99213 | Established patient office or other outpatient visit, typically 15 minutes | N | F | 24 | 28 | 28 | $96.00 | $48.82 | $30.35 | $32.58
3 | 1003000522 | Weigand | Frederick | J | MD | M | I | 1565 Saxon Blvd | Suite 102 | Deltona | FL | 12.0 | 32725 | 1.1 | Secondary flow 30% to <50% to a larger urbanized area of 50,000 and greater | US | Family Practice | Y | 99204 | New patient office or other outpatient visit, typically 45 minutes | N | O | 25 | 25 | 25 | $499.00 | $165.77 | $98.33 | $97.31
4 | 1003000530 | Semonche | Amanda | M | DO | F | I | 1021 Park Ave | Suite 203 | Quakertown | PA | 42.0 | 18951 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Internal Medicine | Y | 90662 | Vaccine for influenza for injection into muscle | Y | O | 93 | 99 | 99 | $82.73 | $54.65 | $54.65 | $54.65
5 | 1003000530 | Semonche | Amanda | M | DO | F | I | 1021 Park Ave | Suite 203 | Quakertown | PA | 42.0 | 18951 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Internal Medicine | Y | 99496 | Transitional care management services, highly complexity, requiring face-to-face visits within 7 days of discharge | N | O | 11 | 13 | 13 | $404.23 | $245.10 | $183.94 | $173.07
6 | 1003000597 | Kim | Dae | Y | M.D., PH.D | M | I | 1145 S Utica Ave | Suite 202 | Tulsa | OK | 40.0 | 74104 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Urology | Y | 51702 | Insertion of indwelling bladder catheter | N | O | 16 | 22 | 22 | $212.00 | $58.49 | $44.64 | $47.68
7 | 1003000597 | Kim | Dae | Y | M.D., PH.D | M | I | 1145 S Utica Ave | Suite 202 | Tulsa | OK | 40.0 | 74104 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Urology | Y | 51741 | Electronic assessment of bladder emptying | N | O | 86 | 96 | 96 | $34.00 | $13.42 | $10.16 | $10.67
8 | 1003000597 | Kim | Dae | Y | M.D., PH.D | M | I | 1145 S Utica Ave | Suite 202 | Tulsa | OK | 40.0 | 74104 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Urology | Y | 51798 | Ultrasound measurement of bladder capacity after voiding | N | O | 199 | 233 | 233 | $32.00 | $11.56 | $9.05 | $9.98
9 | 1003000597 | Kim | Dae | Y | M.D., PH.D | M | I | 1145 S Utica Ave | Suite 202 | Tulsa | OK | 40.0 | 74104 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Urology | Y | 52204 | Biopsy of the bladder using an endoscope | N | F | 22 | 22 | 22 | $745.00 | $134.65 | $106.14 | $107.04

Last rows

Index | Rndrng_NPI | Rndrng_Prvdr_Last_Org_Name | Rndrng_Prvdr_First_Name | Rndrng_Prvdr_MI | Rndrng_Prvdr_Crdntls | Rndrng_Prvdr_Gndr | Rndrng_Prvdr_Ent_Cd | Rndrng_Prvdr_St1 | Rndrng_Prvdr_St2 | Rndrng_Prvdr_City | Rndrng_Prvdr_State_Abrvtn | Rndrng_Prvdr_State_FIPS | Rndrng_Prvdr_Zip5 | Rndrng_Prvdr_RUCA | Rndrng_Prvdr_RUCA_Desc | Rndrng_Prvdr_Cntry | Rndrng_Prvdr_Type | Rndrng_Prvdr_Mdcr_Prtcptg_Ind | HCPCS_Cd | HCPCS_Desc | HCPCS_Drug_Ind | Place_Of_Srvc | Tot_Benes | Tot_Srvcs | Tot_Bene_Day_Srvcs | Avg_Sbmtd_Chrg | Avg_Mdcr_Alowd_Amt | Avg_Mdcr_Pymt_Amt | Avg_Mdcr_Stdzd_Amt
1419253 | 1992999122 | Johnson | Charles | R | D.O. | M | I | 1601 Clint Moore Rd | 155 | Boca Raton | FL | 12.0 | 33487 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Internal Medicine | Y | G0403 | Electrocardiogram, routine ecg with 12 leads; performed as a screening for the initial preventive physical examination with interpretation and report | N | O | 25 | 25 | 25 | $25.00 | $17.85 | $9.64 | $9.22
1419254 | 1992999122 | Johnson | Charles | R | D.O. | M | I | 1601 Clint Moore Rd | 155 | Boca Raton | FL | 12.0 | 33487 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Internal Medicine | Y | G0439 | Annual wellness visit, includes a personalized prevention plan of service (pps), subsequent visit | N | O | 467 | 467 | 467 | $150.11 | $119.93 | $119.93 | $115.87
1419255 | 1992999270 | Bennett | Stephanie | R | PA-C | F | I | 5999 Dundee Rd | Suite 750 | Winter Haven | FL | 12.0 | 33884 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Physician Assistant | Y | 99214 | Established patient office or other outpatient, visit typically 25 minutes | N | O | 38 | 44 | 44 | $273.00 | $90.62 | $53.80 | $55.62
1419256 | 1992999437 | Rivera Mendez | Jose | L | M.D. | M | I | 1600 S Andrews Ave | NaN | Fort Lauderdale | FL | 12.0 | 33316 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Emergency Medicine | Y | 99284 | Emergency department visit, problem of high severity | N | F | 23 | 23 | 23 | $1,236.00 | $131.28 | $100.00 | $89.73
1419257 | 1992999551 | Molai | Indira | NaN | M.D. | F | I | 625 E Grand Ave | NaN | Escondido | CA | 6.0 | 92025 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Internal Medicine | Y | 81003 | Automated urinalysis test | N | O | 47 | 56 | 56 | $6.66 | $2.44 | $2.44 | $2.44
1419258 | 1992999551 | Molai | Indira | NaN | M.D. | F | I | 625 E Grand Ave | NaN | Escondido | CA | 6.0 | 92025 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Internal Medicine | Y | 90653 | Vaccine for influenza for injection into muscle | Y | O | 50 | 50 | 50 | $120.00 | $58.34 | $58.34 | $58.34
1419259 | 1992999775 | Spine Surgery Center Of Eugene, Llc | NaN | NaN | NaN | NaN | O | 1410 Oak St | Suite 300 | Eugene | OR | 41.0 | 97401 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Ambulatory Surgical Center | Y | 63047 | Partial removal of middle spine bone with release of spinal cord and/or nerves | N | F | 23 | 23 | 23 | $15,000.00 | $2,955.25 | $2,352.15 | $2,149.15
1419260 | 1992999825 | Deschenes | Geoffrey | R | M.D. | M | I | 1100 9th Ave | Ms:m4-Pfs | Seattle | WA | 53.0 | 98101 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Otolaryngology | Y | 30140 | Removal of nasal air passage | N | F | 11 | 11 | 11 | $1,819.50 | $159.64 | $117.02 | $108.50
1419261 | 1992999825 | Deschenes | Geoffrey | R | M.D. | M | I | 1100 9th Ave | Ms:m4-Pfs | Seattle | WA | 53.0 | 98101 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Otolaryngology | Y | 99203 | New patient office or other outpatient visit, typically 30 minutes | N | F | 29 | 29 | 29 | $150.00 | $81.53 | $60.69 | $56.54
1419262 | 1992999825 | Deschenes | Geoffrey | R | M.D. | M | I | 1100 9th Ave | Ms:m4-Pfs | Seattle | WA | 53.0 | 98101 | 1.0 | Metropolitan area core: primary flow within an urbanized area of 50,000 and greater | US | Otolaryngology | Y | 99214 | Established patient office or other outpatient, visit typically 25 minutes | N | F | 92 | 115 | 115 | $150.19 | $83.80 | $63.96 | $59.98